NSF PAR Search | NSF Public Access Repository

Enabling Efficient Error-Controlled Lossy Compression for Unstructured Scientific Data

https://doi.org/10.1109/IPDPS64566.2025.00040

Wu, Xuan; Di, Sheng; Ren, Congrong; Jiao, Pu; Xia, Mingze; Wang, Cheng; Guo, Hanqi; Liang, Xin; Cappello, Franck (June 2025, IEEE)

Free, publicly-accessible full text available June 3, 2026
LCP: Enhancing Scientific Data Management with Lossy Compression for Particles

https://doi.org/10.1145/3709700

Zhang, Longtao; Li, Ruoyu; Ren, Congrong; Di, Sheng; Liu, Jinyang; Huang, Jiajun; Underwood, Robert; Grosset, Pascal; Tao, Dingwen; Liang, Xin; et al (February 2025, Proceedings of the ACM on Management of Data)

Many scientific applications opt for particles instead of meshes as their basic primitives to model complex systems composed of billions of discrete entities. Such applications span a diverse array of scientific domains, including molecular dynamics, cosmology, computational fluid dynamics, and geology. The scale of the particles in those scientific applications increases substantially thanks to the ever-increasing computational power in high-performance computing (HPC) platforms. However, the actual gains from such increases are often undercut by obstacles in data management systems related to data storage, transfer, and processing. Lossy compression has been widely recognized as a promising solution to enhance scientific data management systems regarding such challenges, although most existing compression solutions are tailored for Cartesian grids and thus have sub-optimal results on discrete particle data. In this paper, we introduce LCP, an innovative lossy compressor designed for particle datasets, offering superior compression quality and higher speed than existing compression solutions. Specifically, our contribution is threefold. (1) We propose LCP-S, an error-bound aware block-wise spatial compressor to efficiently reduce particle data size while satisfying the pre-defined error criteria. This approach is universally applicable to particle data across various domains, eliminating the need for reliance on specific application domain characteristics. (2) We develop LCP, a hybrid compression solution for multi-frame particle data, featuring dynamic method selection and parameter optimization. It aims to maximize compression effectiveness while preserving data quality as much as possible by utilizing both spatial and temporal domains. (3) We evaluate our solution alongside eight state-of-the-art alternatives on eight real-world particle datasets from seven distinct domains. 
The results demonstrate that our solution achieves up to 104% improvement in compression ratios and up to 593% increase in speed compared to the second-best option, under the same error criteria.
Free, publicly-accessible full text available February 10, 2026
A Prediction‐Traversal Approach for Compressing Scientific Data on Unstructured Meshes with Bounded Error

https://doi.org/10.1111/cgf.15097

Ren, Congrong; Liang, Xin; Guo, Hanqi (June 2024, Computer Graphics Forum)

We explore an error‐bounded lossy compression approach for reducing scientific data associated with 2D/3D unstructured meshes. While existing lossy compressors offer a high compression ratio with bounded error for regular grid data, methodologies tailored for unstructured mesh data are lacking; for example, one can compress nodal data as 1D arrays, neglecting the spatial coherency of the mesh nodes. Inspired by the SZ compressor, which predicts and quantizes values in a multidimensional array, we dynamically reorganize nodal data into sequences. Each sequence starts with a seed cell; based on a predefined traversal order, the next cell is added to the sequence if the current cell can predict and quantize the nodal data in the next cell with the given error bound. As a result, one can efficiently compress the quantized nodal data in each sequence until all mesh nodes are traversed. This paper also introduces a suite of novel error metrics, namely continuous mean squared error (CMSE) and continuous peak signal‐to‐noise ratio (CPSNR), to assess compression results for unstructured mesh data. The continuous error metrics are defined by integrating the error function on all cells, providing objective statistics across nonuniformly distributed nodes/cells in the mesh. We evaluate our methods on several scientific simulations, ranging from ocean‐climate models to computational fluid dynamics simulations, using both traditional and continuous error metrics. Our methods achieve higher compression ratios and better quality than existing lossy compressors.
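
The predict-and-quantize idea that the abstract above attributes to SZ can be illustrated with a minimal Python sketch. This is not the paper's mesh-traversal method: it assumes a plain 1D sequence, a previous-value predictor, and an absolute error bound, and the function names `compress` and `decompress` are hypothetical. It only shows why quantizing prediction residuals in bins of width twice the error bound keeps every reconstructed value within that bound.

```python
import math


def compress(values, error_bound):
    """Return integer quantization codes for a 1D sequence (illustrative only)."""
    bin_width = 2.0 * error_bound
    codes = []
    prev = 0.0  # predictor state: the last *reconstructed* value, not the original
    for v in values:
        # Previous-value prediction: the residual is what the predictor missed.
        code = round((v - prev) / bin_width)
        # Track the value the decompressor will see, so errors do not accumulate.
        prev = prev + code * bin_width
        codes.append(code)
    return codes


def decompress(codes, error_bound):
    """Rebuild the sequence from quantization codes."""
    bin_width = 2.0 * error_bound
    out = []
    prev = 0.0
    for code in codes:
        prev = prev + code * bin_width
        out.append(prev)
    return out


if __name__ == "__main__":
    data = [math.sin(0.1 * i) for i in range(100)]
    eb = 1e-3
    restored = decompress(compress(data, eb), eb)
    print(max(abs(a - b) for a, b in zip(data, restored)))
```

In a real compressor the integer codes would then go through an entropy coder; smooth data yields many small, repeated codes, which is where the size reduction comes from.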
Full Text Available
A Survey on Error-Bounded Lossy Compression for Scientific Datasets

https://doi.org/10.1145/3733104

Di, Sheng; Liu, Jinyang; Zhao, Kai; Liang, Xin; Underwood, Robert; Zhang, Zhaorui; Shah, Milan; Huang, Yafan; Huang, Jiajun; Yu, Xiaodong; et al (May 2025, ACM Computing Surveys)

Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the fidelity of the reconstructed data. Over the years, many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases. They are designed with distinct compression models and principles, so each has its own pros and cons. In this paper we provide a comprehensive survey of emerging error-bounded lossy compression techniques. The key contribution is fourfold. (1) We summarize a novel taxonomy of lossy compression into 6 classic models. (2) We provide a comprehensive survey of 10 commonly used compression components/modules. (3) We summarize the pros and cons of 47 state-of-the-art lossy compressors and present how state-of-the-art compressors are designed based on different compression techniques. (4) We discuss how customized compressors are designed for specific scientific applications and use-cases. We believe this survey is useful to multiple communities including scientific applications, high-performance computing, lossy compression, and big data.
Free, publicly-accessible full text available May 2, 2026